Comment on “Detecting Novel Associations in Large Data Sets”

Authors

  • Malka Gorfine
  • Ruth Heller
  • Yair Heller
Abstract

Reshef et al. presented a novel measure of dependence, the maximal information coefficient (MIC), aimed at capturing a wide range of associations between pairs of variables, and a statistical test for independence based on MIC. They defined a concept of equitability and claimed that non-equitable methods are less practical for data exploration. By simple power comparisons, we show that this conclusion is wrong.

As pointed out by Reshef et al. (Research Article, 17 Dec 2011, p. 1518), the pairwise relationships among many variables are often explored simultaneously. In statistics, this exploration is formalized in a multiple hypothesis testing framework, where the null hypothesis of statistical independence is examined for every pair of variables. The p-values of the tests then serve as a basis for generating final conclusions. Specifically, the pairs of variables are sorted in increasing order of their p-values (or of the adjusted p-values after correcting for multiple testing), and the pairs with the lowest p-values are studied further. Reshef et al. ...
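The screening workflow described in the abstract (test independence for every pair, then order the pairs by p-value or adjusted p-value) can be sketched as follows. This is only an illustration under stated assumptions: the absolute Pearson correlation is a stand-in statistic and is neither MIC nor any test used by the authors, the permutation test and the Benjamini-Hochberg adjustment are one common choice among several, and all function names and the toy data are hypothetical.

```python
import itertools
import random


def abs_pearson(x, y):
    """Absolute Pearson correlation; a stand-in statistic, not MIC."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def permutation_p_value(x, y, statistic, n_perm, rng):
    """P-value for H0 'x and y are independent', obtained by permuting y."""
    observed = statistic(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if statistic(x, y_perm) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_pairs(data, statistic=abs_pearson, n_perm=199, seed=0):
    """Test every pair of variables and sort the pairs by raw p-value."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(sorted(data), 2))
    p_values = [
        permutation_p_value(data[a], data[b], statistic, n_perm, rng)
        for a, b in pairs
    ]
    adjusted = benjamini_hochberg(p_values)
    rows = list(zip(pairs, p_values, adjusted))
    return sorted(rows, key=lambda row: row[1])


# Hypothetical toy data: v depends on u, w is independent noise.
gen = random.Random(1)
u = [gen.gauss(0, 1) for _ in range(60)]
toy = {
    "u": u,
    "v": [2 * a + gen.gauss(0, 0.1) for a in u],
    "w": [gen.gauss(0, 1) for _ in range(60)],
}
ranking = rank_pairs(toy)
for pair, p, p_adj in ranking:
    print(pair, round(p, 3), round(p_adj, 3))
```

Any other test of independence can be passed in through the `statistic` argument. In this setup, the power comparison the abstract refers to amounts to asking how often a truly dependent pair ends up near the top of such a ranking.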


Similar resources


Application of Benford’s Law in Analyzing Geotechnical Data

Benford’s law predicts the frequency of the first digit of numbers met in a wide range of naturally occurring phenomena. In data sets that follow Benford’s law, numbers start with a small leading digit more often than with a large one. The law can be used as a tool for detecting fraud and abnormality in number sets, including fabricated ones. This can be used as an ...


Detecting novel associations in large data sets.

Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination ...


How Can a Global Social Support System Hope to Achieve Fairer Competiveness?; Comment on “A Global Social Support System: What the International Community Could Learn From the United States’ National Basketball Association”

Ooms et al. set out some good general principles for a global social support system to improve fairer global competitiveness as a result of redistribution. This commentary sets out to summarize some of the conditions that would need to be satisfied for it to level up gradients in inequality through such a social support system, using the National Basketball Association (NBA) example as a point ...


Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not perform well in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...



Publication date: 2012